Mini Challenge 3: Cell Phone Calls
Authors and Affiliations:
Student Team: NO
Tool(s): For the VAST competition, the analyses were performed primarily in the Palantir Government platform and, to a lesser extent, in Google Earth and the Palantir Finance platform. Both Palantir platforms are being developed by Palantir Technologies, based in Palo Alto, California. Palantir Technologies was founded in 2004 and works with customers across the Intelligence and Finance Communities.
The development team at Palantir made the decision early in the company's history
to develop an analytic platform based on a foundation of openness, a trait
not often seen in the intelligence community. As old institutions transition
into a world where information is increasingly a commodity, the archaic
paradigms of locking down knowledge are giving way to an environment where
analysis is the real power. Palantir Technologies is able to liberate this power
in four concrete ways. The first is Data Integration: whether structured
or unstructured, Palantir provides standard and extensible interfaces for
bringing information into a common environment. The second is Search and
Discovery, whereby these disparate data stores can be explored as though they
were one. The third is Knowledge Management, in which all the knowledge that
is discovered is treated like another data source so that no analysis is lost.
Finally, the fourth is Collaboration, whereby many analysts working together
can truly leverage their collective mind. Through our open APIs and numerous
(and multiplying) extensibility points, Palantir has succeeded in creating a
genuine platform for application-development and information-analysis.
Phone-2: Characterize the changes in the Catalano/Vidro social structure over the ten day period.
Video link:
Detailed Answer:
The first thing any cinema fugitive does is chuck the cell phone that
investigators are bound to track. While these scenarios are fictional,
the importance of call records today is not, as Palantir’s customers
regularly deal with massive amounts of SIGINT (signals intelligence).
Through the Intermediary Framework, Palantir allows phone calls to be
viewed either as independent events or as links between entities. It is
not an either/or choice—both states exist simultaneously and can be
viewed depending on which one is more appropriate to the analytic task
at hand (figures 1.1 and 1.2).
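The dual view described above can be illustrated outside Palantir with a minimal Python sketch. It is not the Intermediary Framework itself: the `CallEvent` class, its field names, and the `as_links` helper are hypothetical, and the three records are made-up toy data. The sketch keeps each call as an independent event and derives the link view from the same records, so both representations are available at once.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallEvent:
    """One call record, kept as an independent event (hypothetical schema)."""
    caller: int       # ID of the phone that placed the call
    receiver: int     # ID of the phone that was called
    start: datetime   # time of the call
    tower: int        # cell tower that originated the call


def as_links(events):
    """Collapse call events into undirected links weighted by call count."""
    return Counter(frozenset((e.caller, e.receiver)) for e in events)


# Toy records, invented for illustration (not taken from the challenge data).
events = [
    CallEvent(1, 2, datetime(2000, 1, 1, 9, 0), tower=7),
    CallEvent(2, 1, datetime(2000, 1, 1, 9, 30), tower=4),
    CallEvent(1, 3, datetime(2000, 1, 1, 10, 0), tower=7),
]

links = as_links(events)
for pair, count in links.items():
    print(sorted(pair), count)
```

The event list preserves time and tower for timeline or geospatial work, while the link counts correspond to the line thickness used in a graph view.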
Figures 1.1 and 1.2: Phone calls as a link (left) and as distinct events (right)
Our team imported the records
provided as phone calls with references to both phones involved, the time of
the call, and the cell tower that originated the call. We also added
geo-coordinates to the cell towers, allowing us to view the origin of the calls
geospatially (figure 2).
Figure 2: All the calls in 200’s 2nd-order network
Given medium confidence that ID 200 is Paraiso leader Ferdinando Catalano, we decided
to treat this as a starting premise that we would attempt to verify later. We
performed egocentric social network analysis on ID 200 by pulling all
first-order connections into the Graph View (anyone directly linked to Catalano
by a phone call) (figure 3).
Figure 3: 200’s immediate network (line thickness denotes number of links)
An
organization head often only communicates directly with his elites, and
the immediate network revealed is indeed very tight. The member-node
who has talked most often (14 calls) to 200 is number 5, and our
intelligence suggested that brother Esteban Catalano was most likely to
hold this position. We also knew that David Vidro is Catalano’s deputy
and, therefore, expected him to have the second greatest number of
communications. From this, we established that David most likely
possesses ID 1. Palantir’s histogram, which provides high-level
overviews of data selections and their commonalities, made drawing
these conclusions relatively straightforward (figure 4). ID numbers 2
and 3 are directly linked to David, so we suspect that they belong to
Juan and Jorge Vidro. Unfortunately, we do not have sufficient
information to distinguish between these two brothers.
Figure 4: 1st-order network histogram
The next step in the investigation
was an expansion to second-order connections (anyone connected to Catalano or
one of his connections; figure 5).
Figure 5: 2nd-order network
Pulling
up our Timeline View immediately revealed two details about the phone
call record: a cyclical rise/fall of calls and a sudden drop in traffic
during the last three days. The daily pattern is intuitive, as calls
are less frequent during the night. The network silence starting on
Thursday, however, does not have a similarly apparent explanation. The
previous Thursday, Friday, and Saturday had high levels of traffic, but
at the end of the time period, IDs 1, 2, 3, and 5 were barely active in
the inner network, and most of the few calls that were being made
passed through ID 137. Potential explanations include the movement
keeping a low profile, a sudden shift in the power structure, or, most
likely, a decision to change cell phones.
Figure 6.1: All 400 callers, with an Auto-Layout (this jumbled layout means they are all calling each other)
Figure 6.2: The timeline of the total network (7.1)
At
this point, we needed to compare our proposed inner-circle to the
overall network. We opened all 400 people and searched for all links
between them. The results were important on several counts (figures 6.1
and 6.2): first, ID 1 had by far the largest number of calls, with ID 5 in
second place and ID 200 quite low on the list. Those positions confirmed
our initial ID assignments, as brother Esteban (5) should be replaced
by deputy David (1) in the top position on the macro-scale and leader
Ferdinando (200) should not be making very many calls overall. The
second item we noticed from the overall network view was that there was
no drop in traffic during the last several days, bolstering the idea
that the inner network simply changed phones.
To check this hypothesis, we used the History tool to revert to all 400
entities without connections between them. We then performed a “Search
Around” (figure 7.1), looking only at calls placed after Thursday.
We discovered an entirely different set of key players in the Histogram
here (including 309, 306, 360, 0, and 397). ID 0 had been active
throughout, but the others were new, so we removed everyone except those four from
the graph. When we compared the immediate network of 309, 306, 360, and
397 to that of 1, 2, 3, and 5, we saw an almost identical picture
(figures 7.2 and 7.3) with an inverse Timeline (figures 7.4 and 7.5).
Figure 7.1: Search Around: find entities linked by phone calls within the specified time window
Figures 7.2 and 7.3: Compare 200’s network (left) with 300’s network (right)
Figures 7.4 and 7.5: Timeline for this new network
ID 300, placed in the middle and connected to all four 300-series
entities, was clearly Ferdinando Catalano, so we started with a fresh
investigation and used our workflow from ID 200 on ID 300. This smaller
subset of the records was slightly more ambiguous: ID 268 conversed by
far the most with 300, suggesting he might be Esteban, but, unlike ID 5, he is hardly
connected to anyone else in the network. 309’s network very
closely mirrors 1’s and uniquely connects to all the nodes in 300’s
network, but 309 talks the least of anyone to 300. Ultimately, we decided
that the first-order network over a 3-day period was less reliable than
the second-order network—David Vidro didn’t forget what to do because
he got a new cell phone. Before making the final decisions, however, we
decided to check our geospatial records.
We viewed the links between callers as phone call events and moved all
dialed calls to Google Earth because we only have the originators’
cell towers. We then compared the movements of the initial and final
version of the network; unfortunately, the results were of little help
for two reasons. First, in the initial network, only ID 3 frequently
leaves the center of the island, so the initial patterns are fairly
similar. Second, in the final network, everyone begins to move more
frequently. This is potentially significant for the Movement, but
leaves some guesswork in the final assignments. We then decided to
compare the members of the old and new networks and realized that
everyone outside the inner network had kept the same IDs. That proved
309 was formerly 1 (David), 306 was formerly 5 (Esteban), 397 was formerly 2
(Juan or Jorge), 360 was formerly 3 (the other of Juan and Jorge), and 300
was, of course, 200, or Ferdinando.
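The comparison of old and new network members was done interactively in Palantir, and the text does not describe it as an algorithm. As a rough sketch of the same idea, assuming that contacts outside the inner circle keep their IDs, each new ID can be paired with the old ID whose outside contacts overlap most (Jaccard similarity). The function names and the toy call lists below are hypothetical and are not taken from the challenge data.

```python
def neighbors(calls, node, exclude=frozenset()):
    """IDs that share at least one call with `node`, ignoring IDs in `exclude`."""
    found = set()
    for a, b in calls:
        if a == node and b not in exclude:
            found.add(b)
        elif b == node and a not in exclude:
            found.add(a)
    return found


def jaccard(left, right):
    """Overlap of two sets: size of intersection divided by size of union."""
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def best_matches(old_calls, new_calls, old_ids, new_ids):
    """Map each new ID to the old ID with the most similar outside contacts."""
    inner = frozenset(old_ids) | frozenset(new_ids)
    matches = {}
    for new_id in new_ids:
        new_nb = neighbors(new_calls, new_id, exclude=inner)
        scores = {
            old_id: jaccard(new_nb, neighbors(old_calls, old_id, exclude=inner))
            for old_id in old_ids
        }
        matches[new_id] = max(scores, key=scores.get)
    return matches


# Toy (caller, receiver) pairs, invented for illustration only.
old_calls = [(10, 101), (10, 102), (10, 103), (11, 103), (11, 104)]
new_calls = [(20, 103), (20, 104), (21, 101), (21, 102)]

matches = best_matches(old_calls, new_calls, old_ids=[10, 11], new_ids=[20, 21])
print(matches)
```

A greedy per-ID maximum like this can assign two new IDs to the same old ID; with only a handful of candidates, as here, such conflicts are easy to resolve by inspection.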